Word Sense Disambiguation Using Sense Examples Automatically Acquired from a Second Language

Authors

  • Xinglong Wang
  • John A. Carroll
Abstract

We present a novel almost-unsupervised approach to the task of Word Sense Disambiguation (WSD). We build sense examples automatically, using large quantities of Chinese text, and English-Chinese and Chinese-English bilingual dictionaries, taking advantage of the observation that mappings between words and meanings are often different in typologically distant languages. We train a classifier on the sense examples and test it on a gold standard English WSD dataset. The evaluation gives results that exceed previous state-of-the-art results for comparable systems. We also demonstrate that a little manual effort can improve the quality of sense examples, as measured by WSD accuracy. The performance of the classifier on WSD also improves as the number of training sense examples increases.
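The pipeline described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy dictionaries `EN_ZH` and `ZH_EN`, the tiny pre-segmented corpus `ZH_CORPUS`, the sense labels, and the choice of a naive Bayes classifier (the abstract only says "a classifier"). The idea shown is that each sense of an ambiguous English word is mapped to its Chinese translations, Chinese sentences containing those translations are collected, and their context words are translated back into English to serve as sense examples.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy resources, invented for illustration (not from the paper).
# English-Chinese dictionary: each sense of an English word -> Chinese translations.
EN_ZH = {
    "bank": {
        "bank/finance": ["银行"],
        "bank/river": ["河岸"],
    }
}
# Chinese-English dictionary used to translate context words back to English.
ZH_EN = {
    "存款": ["deposit"], "贷款": ["loan"], "钱": ["money"],
    "河": ["river"], "水": ["water"], "船": ["boat"],
}
# Stand-in for a large, already word-segmented Chinese corpus.
ZH_CORPUS = [
    ["银行", "存款", "钱"],
    ["银行", "贷款"],
    ["河岸", "河", "水"],
    ["河岸", "船", "水"],
]


def build_sense_examples(word):
    """Collect (sense, English context words) pairs via the second language."""
    examples = []
    for sense, translations in EN_ZH[word].items():
        for sentence in ZH_CORPUS:
            if any(t in sentence for t in translations):
                # Translate every context word that the dictionary covers.
                context = [
                    en
                    for zh in sentence
                    if zh not in translations
                    for en in ZH_EN.get(zh, [])
                ]
                if context:
                    examples.append((sense, context))
    return examples


def train_naive_bayes(examples):
    """Count senses and context words (assumed classifier, not the paper's)."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, context in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(context)
        vocab.update(context)
    return sense_counts, word_counts, vocab


def classify(model, context):
    """Return the most probable sense using add-one smoothed log probabilities."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context:
            if w in vocab:  # ignore words never seen in any sense example
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


examples = build_sense_examples("bank")
model = train_naive_bayes(examples)
print(classify(model, ["money", "loan"]))
print(classify(model, ["water", "boat"]))
```

In a realistic setting the corpus and dictionaries would be far larger and noisier, which is presumably where the manual effort mentioned in the abstract comes in; this sketch makes no attempt to model that.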


Similar resources

DALE: A Word Sense Disambiguation System for Biomedical Documents Trained using Automatically Labeled Examples

Automatic interpretation of documents is hampered by the fact that language contains terms which have multiple meanings. These ambiguities can still be found when language is restricted to a particular domain, such as biomedicine. Word Sense Disambiguation (WSD) systems attempt to resolve these ambiguities but are often only able to identify the meanings for a small set of ambiguous terms. DALE...


Word Sense Disambiguation Using Semi-Supervised Naive Bayes with Ontological Constraints

Background. Word sense disambiguation (WSD) is the task of mapping an ambiguous word to its correct sense given its context. As high-quality sense-tagged data is scarce and expensive to obtain, attention has shifted from fully supervised to semi-supervised and knowledge-based approaches to WSD that rely on a lexical knowledge base such as WordNet instead of large amounts of hand-labeled data. Wha...


Analogical Word Sense Disambiguation

Word sense disambiguation is an important problem in learning by reading. This paper introduces analogical word-sense disambiguation, which uses human-like analogical processing over structured, relational representations to perform word sense disambiguation. Cases are automatically constructed using representations produced via natural language analysis of sentences, and include both conceptua...


Combining Machine Readable Lexical Resources and Bilingual Corpora for Broad Word Sense Disambiguation

This paper describes a new approach to word sense disambiguation (WSD) based on automatically acquired "word sense division". The semantically related sense entries in a bilingual dictionary are arranged in clusters using a heuristic labeling algorithm to provide a more complete and appropriate sense division for WSD. Multiple translations of senses serve as outside information for automatic tag...


Evaluating large-scale Knowledge Resources across Languages

This paper presents an empirical evaluation in a multilingual scenario of the semantic knowledge present on publicly available large-scale knowledge resources. The study covers a wide range of manually and automatically derived large-scale knowledge resources for English and Spanish. In order to establish a fair and neutral comparison, the knowledge resources are evaluated using the same method...



Publication date: 2005